
Conversation

@doringeman (Collaborator)

Add request monitoring, built on docker/model-runner#157.

In docker/model-runner#157:

MODEL_RUNNER_PORT=8080 make run

Here:

$ make install

$ MODEL_RUNNER_HOST=http://localhost:8080 docker model requests --help
Usage:  docker model requests [OPTIONS]

Fetch requests+responses from Docker Model Runner

Options:
  -f, --follow             Follow requests stream
      --include-existing   Include existing requests when starting to follow (only available with --follow)
      --model string       Specify the model to filter requests

$ MODEL_RUNNER_HOST=http://localhost:8080 docker model requests --include-existing -f --model ai/qwen3:0.6B-Q4_0
Connected to request stream. Press Ctrl+C to stop.
{"id":"sha256:df9f2a333a636ca3a290700759adc29fad54cb2dee5b2f198e1bce26101686eb_1757684269324900000","model":"ai/qwen3:0.6B-Q4_0","method":"POST","url":"/engines/v1/chat/completions","request":"{\"model\":\"ai/qwen3:0.6B-Q4_0\",\"messages\":[{\"role\":\"user\",\"content\":\"hi\"}],\"stream\":true}","response":"{\"choices\":[{\"finish_reason\":\"stop\",\"index\":0,\"message\":{\"content\":\"Hi! How can I assist you today? 😊\",\"reasoning_content\":\"okay, the user just said \\\"hi\\\". I should respond politely. Let me make sure to acknowledge their greeting.\\n\\nFirst, a simple \\\"hi\\\" is good. Then, maybe add something like \\\"Hello! How can I help you today?\\\" That gives a friendly response and opens up the conversation.\\n\\nI should keep it short and positive. Don't be too formal. Something like \\\"Hi! How can I assist you today?\\\" sounds natural and helpful.\\n\\nLet me check if there's any other way to respond. Maybe add a question to engage them further. Like, \\\"Are there any specific topics you'd like to discuss?\\\" That encourages interaction.\\n\\nYes, that works. So the response should be short, friendly, and open for further conversation.\",\"role\":\"assistant\"}}],\"created\":1757684270,\"id\":\"chatcmpl-cEC7vKkmKfxQEXA4Gr6hM0MlXefr5GGs\",\"model\":\"ai/qwen3:0.6B-Q4_0\",\"object\":\"chat.completion\",\"system_fingerprint\":\"b1-c610b6c\",\"timings\":{\"predicted_ms\":981.252,\"predicted_n\":166,\"predicted_per_second\":169.17162971387575,\"predicted_per_token_ms\":5.911156626506024,\"prompt_ms\":33.441,\"prompt_n\":9,\"prompt_per_second\":269.1307078137615,\"prompt_per_token_ms\":3.715666666666667},\"usage\":{\"completion_tokens\":166,\"prompt_tokens\":9,\"total_tokens\":175}}","timestamp":1757684269,"status_code":200,"user_agent":"docker-model-cli/dev"}

Note the shell completion added for --model.
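Since the CLI here is cobra-based (as Docker CLI plugins generally are), that completion is typically wired with RegisterFlagCompletionFunc. A minimal sketch; listLocalModels is a hypothetical stand-in for however the CLI actually enumerates models:

```go
package commands

import "github.com/spf13/cobra"

// listLocalModels is a hypothetical helper standing in for however the
// CLI enumerates locally available models.
func listLocalModels() ([]string, error) {
	return []string{"ai/qwen3:0.6B-Q4_0"}, nil
}

// registerModelCompletion offers model names when the user tab-completes --model.
func registerModelCompletion(cmd *cobra.Command) {
	cmd.Flags().String("model", "", "Specify the model to filter requests")
	_ = cmd.RegisterFlagCompletionFunc("model",
		func(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) {
			models, err := listLocalModels()
			if err != nil {
				return nil, cobra.ShellCompDirectiveError
			}
			return models, cobra.ShellCompDirectiveNoFileComp
		})
}
```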

TODO: Add --format.
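One possible shape for that TODO, assuming it would follow Docker's usual Go-template convention for --format (nothing below is in this PR; the flag semantics and field names are guesses):

```go
package main

import (
	"fmt"
	"os"
	"text/template"
)

// entry is a guessed subset of the fields a --format template could expose,
// e.g. --format '{{.Model}} {{.StatusCode}}'.
type entry struct {
	Model      string
	Method     string
	URL        string
	StatusCode int
}

// printFormatted renders one entry through a user-supplied Go template.
func printFormatted(format string, e entry) error {
	tmpl, err := template.New("format").Parse(format)
	if err != nil {
		return fmt.Errorf("invalid format: %w", err)
	}
	if err := tmpl.Execute(os.Stdout, e); err != nil {
		return err
	}
	fmt.Println()
	return nil
}

func main() {
	_ = printFormatted("{{.Method}} {{.URL}} -> {{.StatusCode}}",
		entry{Model: "ai/qwen3:0.6B-Q4_0", Method: "POST", URL: "/engines/v1/chat/completions", StatusCode: 200})
}
```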

@xenoscopic (Contributor) left a comment


LGTM, though I still agree with @ilopezluna that it'd be worth combining the two endpoints and differentiating on Accept in docker/model-runner#157, if it's not too much work.

Differentiate regular and streaming responses based on the Accept header.

Signed-off-by: Dorin Geman <[email protected]>
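For context, the server-side pattern that commit describes looks roughly like this. The exact media type the runner keys on isn't shown in this thread; text/event-stream is an assumption for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
	"time"
)

// handleRequests serves both modes from a single endpoint, switching on Accept.
func handleRequests(w http.ResponseWriter, r *http.Request) {
	if strings.Contains(r.Header.Get("Accept"), "text/event-stream") {
		f, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		for i := 0; ; i++ { // stand-in for a live feed of logged requests
			select {
			case <-r.Context().Done(): // client went away
				return
			case <-time.After(time.Second):
				fmt.Fprintf(w, "{\"id\":%d}\n", i)
				f.Flush()
			}
		}
	}
	// One-shot mode: return the existing entries as a single JSON array.
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode([]any{ /* existing entries */ })
}

func main() {
	http.HandleFunc("/requests", handleRequests)
	_ = http.ListenAndServe(":8080", nil)
}
```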
@doringeman (Collaborator, Author)

Combined the endpoints as discussed here in e806cc4. Thanks!

@ericcurtin requested a review from Copilot on September 15, 2025 at 14:11.
Copilot AI left a comment

Pull Request Overview

This PR adds a new requests command to the Docker Model CLI that fetches requests and responses from the Docker Model Runner, enabling monitoring of model inference activities. The implementation includes both one-shot and streaming modes with optional model filtering.

  • Adds docker model requests command with streaming and filtering capabilities
  • Updates dependencies to include model-runner support for request monitoring
  • Provides comprehensive documentation and CLI completion for the new feature
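Sketched from the client side, such a method can look like the following. The /requests path, the model query parameter, and the Accept value are assumptions for illustration, not the PR's actual code:

```go
package desktop

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// Client is a trimmed-down stand-in for the CLI's Model Runner client.
type Client struct {
	base string // e.g. http://localhost:8080
	hc   *http.Client
}

// Requests fetches logged requests, optionally as a live stream.
// The caller reads line-delimited JSON from the returned body.
func (c *Client) Requests(model string, follow bool) (io.ReadCloser, error) {
	u := c.base + "/requests" // assumed endpoint path
	if model != "" {
		u += "?model=" + url.QueryEscape(model) // assumed filter parameter
	}
	req, err := http.NewRequest(http.MethodGet, u, nil)
	if err != nil {
		return nil, err
	}
	if follow {
		req.Header.Set("Accept", "text/event-stream") // assumed streaming media type
	}
	resp, err := c.hc.Do(req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		resp.Body.Close()
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return resp.Body, nil
}
```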

Reviewed Changes

Copilot reviewed 8 out of 89 changed files in this pull request and generated 2 comments.

Summary per file:

go.mod: Updates model-runner and model-distribution dependencies to support request monitoring
docs/reference/model_requests.md: Adds documentation for the new requests command
docs/reference/model.md: Updates the parent command documentation to include the requests subcommand
docs/reference/docker_model_requests.yaml: Defines the CLI specification for the requests command options
docs/reference/docker_model.yaml: Updates the parent command specification to include requests
desktop/desktop.go: Implements the HTTP client method for fetching requests with streaming support
commands/root.go: Registers the new requests command with the CLI
commands/requests.go: Implements the requests command with streaming, filtering, and completion


@doringeman merged commit f064505 into docker:main on Sep 17, 2025. 4 checks passed.